Large Scale Video Analysis Platform
Introduction
The big video data processing platform is built on the Hadoop ecosystem, mainly for offline batch processing of massive video data. It handles the storage and management of structured, semi-structured, and unstructured data, such as video data, labels, wrapped text, key frames, and frame features.
Framework
The platform includes the following components, each described in turn below:
- Data transmission components
- Distributed storage
- Distributed computing
- System monitoring and management
- Unified deployment platform
The platform provides several components for data transmission: 1. cross-platform clients, which work like a "cloud drive" on Linux and Windows and can be used as local disks or folders; 2. ETL tools, which convert data among structured, semi-structured, and unstructured models; 3. a crawler system, developed in our laboratory, which collects text, images, audio, and video data from the Internet; 4. a storage API, through which users can also transfer data with the standard HDFS API.
The platform combines several storage systems. Video data is stored in HDFS for fast access. The NoSQL database HBase is highly scalable and well suited to sparse data, so the platform stores semi-structured and unstructured data such as labels, wrapped text, key frames, and frame features in HBase.
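To illustrate why a sparse store suits this kind of per-frame data, the following is a minimal sketch of one possible row layout. The row-key format, the column families `meta` and `feat`, and the helper names are all hypothetical and are not taken from the platform; the sketch only models the cells in plain Python and does not talk to HBase.

```python
import struct


def frame_row_key(video_id, frame_index):
    """Build a row key that keeps the frames of one video together, in frame order."""
    # Zero-padding makes lexicographic (byte) order match numeric frame order.
    return f"{video_id}#{frame_index:010d}".encode("utf-8")


def frame_cells(label=None, text=None, feature=None):
    """Return only the cells that actually have a value, i.e. a sparse row.

    Column names such as b"meta:label" are hypothetical examples.
    """
    cells = {}
    if label is not None:
        cells[b"meta:label"] = label.encode("utf-8")
    if text is not None:
        cells[b"meta:text"] = text.encode("utf-8")
    if feature is not None:
        # Pack the feature vector as big-endian 32-bit floats.
        cells[b"feat:vector"] = struct.pack(f">{len(feature)}f", *feature)
    return cells
```

Under this assumed layout, a frame that has only a label occupies a single cell, while a key frame may also carry a feature vector. Missing columns cost nothing, which is the property the paragraph above relies on.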
The distributed computing framework is built on Hadoop MapReduce. For distributed batch processing of massive video data, the platform builds on the Hadoop Streaming package to provide "embedded" video processing components and a unified interface. Stand-alone video processing algorithms are packaged as executable files, which the platform interface can call without modification to run them as distributed video processing jobs. Typical applications include video summarization, video transcoding, and video detection.
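The idea of calling an unmodified executable from a Hadoop Streaming job can be sketched as a small mapper script. This is an illustration under stated assumptions, not the platform's actual interface: it assumes each input line holds one video path readable by the task, and the executable name `./summarize_video` is hypothetical.

```python
#!/usr/bin/env python3
"""Hadoop Streaming mapper sketch: run a stand-alone executable on each input video."""
import subprocess
import sys


def process_line(line, command):
    """Run `command` on one video path and return a 'key<TAB>value' record.

    `command` is a list such as ["./summarize_video"] (hypothetical name);
    the video path is appended as its last argument.
    """
    path = line.strip()
    if not path:
        return None  # skip blank input lines
    result = subprocess.run(command + [path], capture_output=True, text=True)
    if result.returncode != 0:
        # Emit the failure as a record so one bad video does not abort the task.
        return f"{path}\tERROR exit={result.returncode}"
    # Collapse the tool's output onto one line, since Streaming records are line-based.
    value = " ".join(result.stdout.split())
    return f"{path}\t{value}"


def run_mapper(lines, command, out=sys.stdout):
    """Map every input line to at most one output record."""
    for line in lines:
        record = process_line(line, command)
        if record is not None:
            out.write(record + "\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example invocation inside a Streaming task: mapper.py ./summarize_video
    run_mapper(sys.stdin, sys.argv[1:])
```

In a Streaming job, a script like this would typically be passed through the `-mapper` option and shipped to the worker nodes together with the executable. The algorithm itself stays a stand-alone program, and only the thin wrapper knows about the key/value line protocol.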
Ganglia monitoring tools are attached to the system to monitor the distributed cluster, including CPU usage, throughput, memory, and hard disk usage. When a node fails, Ganglia informs the user.
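As a rough illustration of how a failed node might be detected from Ganglia data, the sketch below parses an XML cluster dump of the kind served by Ganglia's `gmond` daemon. It assumes that each `HOST` element carries `NAME`, `TN` (seconds since the last report), and `TMAX` (expected maximum reporting interval) attributes. The function name and the threshold factor of 4 are assumptions for this example, not the platform's configuration.

```python
import xml.etree.ElementTree as ET


def failed_hosts(gmond_xml, factor=4):
    """Return the names of hosts whose last report is older than factor * TMAX seconds."""
    root = ET.fromstring(gmond_xml)
    down = []
    for host in root.iter("HOST"):
        tn = int(host.get("TN", "0"))      # seconds since the host last reported
        tmax = int(host.get("TMAX", "0"))  # expected maximum reporting interval
        if tmax > 0 and tn > factor * tmax:
            down.append(host.get("NAME"))
    return down
```

A monitoring script could call this periodically on the latest dump and notify an operator whenever the returned list is not empty.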
The platform provides a complete set of automatic deployment tools, which install the operating system, the Hadoop ecosystem, and the entire video data processing platform on a machine.
Patents:
- Guiguang Ding, Xinpeng Dong, Yan Yu, Jile Zhou: A mounting method and system for a network file system based on a USB flash drive. Chinese patent application 201310329823.4, 2013.11.20. [丁贵广, 董欣鹏, 于琰, 周继乐. 基于U盘的网络文件系统的挂载方法及系统: 中国, 201310329823.4. 2013.11.20.]